Chapter 1

Introduction 


Understanding spreading processes in complex networks and designing control strategies to contain them
are relevant problems in many different settings, such as epidemiology and public health [4], computer
viruses [23], information propagation in social networks [44], or security of cyber-physical networks [73]. In
this chapter, we describe a bio-inspired framework for optimal allocation of resources to prevent spreading
processes in complex cyber-physical networks. Our motivation comes from recent advances on the
problem of containing epidemics in human contact networks. The most popular dynamic epidemic
model is the Susceptible-Infected-Susceptible (SIS) model [?]. In this model, a given population is
divided into two compartments. The first compartment, called 'Susceptible' (S), contains individuals
who are healthy, but susceptible to becoming infected. The second compartment is called 'Infected'
(I) and contains individuals who are infected and able to recover from the disease. Individuals can
transition from S to I as they become infected, and from I to S as they recover. In addition to the
SIS model, there are many other models able to capture more realistic spreading processes. This is
often done by adding extra compartments representing a variety of disease stages. There are many
works that analyze different variations of the SIS model, such as extensions to a higher number of disease
states [35, ?], or explicit modeling of birth and mortality rates [50, ?]. Stability
results are obtained in [50, 51, 55] using Lyapunov analysis, or in [?] using Volterra integral models.

In the literature, we find several approaches to model spreading mechanisms in arbitrary contact
networks. Spreading in arbitrary (undirected) contact networks was first studied by
Wang et al. [55] for a discrete-time Susceptible-Infected-Susceptible (SIS) model. In [?], Ganesh et al.
studied the epidemic threshold in a continuous-time SIS spreading process. In both continuous- and
discrete-time models, there is a close connection between the speed of the spreading and the spectral
radius of the network (i.e., the largest eigenvalue of its adjacency matrix) [?]. Designing strategies to
contain spreading processes in networks is a central problem in public health and network security. In
this context, the following question is of particular interest: given a contact network (possibly weighted
and/or directed) and resources that provide partial protection (e.g., vaccines and/or antidotes), how
should one distribute these resources throughout the network in a cost-optimal manner to contain the
spread? This question has been addressed in several papers. Cohen et al. [?] proposed a heuristic
vaccination strategy called acquaintance immunization policy and proved it to be much more efficient
than random vaccine allocation. In [7], Borgs et al. studied theoretical limits in the control of spreads
in undirected networks with a non-homogeneous distribution of antidotes. Chung et al. [?] studied a
heuristic immunization strategy based on the PageRank vector of the contact graph. In the control
systems literature, Wan et al. proposed in [?] a method to design control strategies by allocating
heterogeneous resources in undirected networks. In [?], the authors present a spectral analysis of
proximity random graphs with applications to virus spread. In [26], the authors study the problem
of minimizing the level of infection in an undirected network using corrective resources within a given
budget. In [?], a linear-fractional optimization program was proposed to compute the optimal investment
on disease awareness over the nodes of a social network to contain a spreading process. In particular,
we will cover in detail the work in [65, 66, 64, 69], where the authors developed a convex formulation to
find the optimal allocation of protective resources in a network. An analysis of greedy control strategies
and worst-case conditions was presented in [?]. Recent extensions include the analysis of more general
epidemic models [52], competing diseases [?, 50], time-switching networks [53, 57, 55], and
non-Poissonian spreading and recovery rates [54, 56]. A novel data-driven
optimization framework has also been recently proposed by Han et al. in [28]. A distributed framework
for optimal allocation of resources has also been proposed in [70]. A novel analysis of epidemic models
in arbitrary graphs using tools from positive systems can be found in [55].

In this chapter, we describe an optimization-based framework to find the optimal allocation of
protection resources in weighted and directed networks of nonidentical agents in polynomial time. In our
study, we consider two types of containment resources:

• Preventive resources able to protect (or 'immunize') nodes against the spreading (such as vaccines
in a viral infection process). This type of resource is allocated at nodes and/or edges of the
network before the spread has reached them, so that these elements are protected from the spread.
The effect of this resource is to reduce the rate at which the spread can reach an element.

• Corrective resources able to neutralize the spreading after it has reached a node (such as antidotes
in a viral infection). Notice that, in contrast with preventive resources, corrective resources are
used after the spread has reached a node in the network. The effect of this type of resource is to
increase the rate of recovery of an element after the spread has reached it.

In the framework presented herein, we assume there are costs associated with these resources and
study the problem of finding the cost-optimal distribution of resources throughout the network to contain
the spreading. The cost of a protection resource depends on the
level of protection it achieves. For example, the larger the investment on vaccines and
antidotes, the higher the level of protection achieved by the population in which the resources have been
distributed. One of the main questions in epidemiology and public health is to find the optimal allocation
of preventive and corrective resources to contain an epidemic outbreak in a cost-optimal manner. An
identical question can be asked in the context of designing protection strategies for other cyber-physical
networks, motivating the main problem covered in this chapter:

Problem. Find the cost-optimal allocation of preventive and corrective resources to protect a
cyber-physical network against spreading processes.

In the field of systems reliability, there is a well-developed theory of preventive and corrective
maintenance for single components or machines, but there is a lack of a theoretical framework to analyze
large-scale interdependent systems [?]. The state of the art in the reliability analysis of networked
systems is mostly based on Markov models [33, ?]. These models usually suffer from scalability
issues, since the state space grows exponentially fast with the number of components under
consideration. Similar Markov models have also been proposed in the analysis of disease spreading in networked
populations. A rich and growing literature is arising in this context, proposing a variety of approaches
to find efficient allocations of protection resources to contain an epidemic outbreak. In a series of papers,
Preciado et al. developed a mathematical framework, based on dynamical systems theory and convex
optimization, to find the optimal distribution of protection resources in a complex network [63, 65, 62, 68].
In particular, they showed that it is possible to find the cost-optimal distribution of vaccines and
antidotes in a (possibly weighted and directed) social network of nonidentical nodes in polynomial time using
geometric programming [66]. This framework has also been extended to allocate traffic-control
resources and find the cost-optimal traffic profile in a transportation network to contain the spread of a
disease among cities [64].

1.1 Mathematical Framework 

We introduce notation and preliminary results needed in our derivations. In the rest of the paper,
we denote by $\mathbb{R}^n_{+}$ (respectively, $\mathbb{R}^n_{++}$) the set of $n$-dimensional vectors with nonnegative (respectively,
positive) entries. We denote vectors using boldface letters and matrices using capital letters. $I$ denotes
the identity matrix and $\mathbf{1}$ the vector of all ones. $\Re(z)$ denotes the real part of $z \in \mathbb{C}$.

1.1.1 Graph Theory 

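A weighted, directed graph (also called digraph) is defined as the triad $\mathcal{G} \triangleq (\mathcal{V}, \mathcal{E}, \mathcal{W})$, where (i) $\mathcal{V} = \{v_1, \dots, v_n\}$
is a set of $n$ nodes, (ii) $\mathcal{E} \subseteq \mathcal{V} \times \mathcal{V}$ is a set of ordered pairs of nodes called directed edges,
and (iii) the function $\mathcal{W} : \mathcal{E} \to \mathbb{R}_{++}$ associates positive real weights to the edges in $\mathcal{E}$. By convention,
we say that $(v_j, v_i)$ is an edge from $v_j$ pointing towards $v_i$. We define the in-neighborhood of node
$v_i$ as $\mathcal{N}_i^{in} = \{j : (v_j, v_i) \in \mathcal{E}\}$, i.e., the set of nodes with edges pointing towards $v_i$. We define the
weighted in-degree (resp., out-degree) of node $v_i$ as $\deg_{in}(v_i) = \sum_{j : (v_j, v_i) \in \mathcal{E}} \mathcal{W}((v_j, v_i))$
(resp., $\deg_{out}(v_i) = \sum_{j : (v_i, v_j) \in \mathcal{E}} \mathcal{W}((v_i, v_j))$). In other words, the weighted degrees are the sum of the
edge weights attached to a node.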

The adjacency matrix of a weighted, directed graph $\mathcal{G}$, denoted by $A_{\mathcal{G}} = [a_{ij}]$, is an $n \times n$ matrix defined
entry-wise as $a_{ij} = \mathcal{W}((v_j, v_i))$ if edge $(v_j, v_i) \in \mathcal{E}$, and $a_{ij} = 0$ otherwise. Given an $n \times n$ matrix
$M$, we denote by $\mathbf{v}_1(M), \dots, \mathbf{v}_n(M)$ and $\lambda_1(M), \dots, \lambda_n(M)$ the set of eigenvectors and corresponding
eigenvalues of $M$, respectively, where we order them in decreasing order of their real parts, i.e.,
$\Re(\lambda_1) \geq \Re(\lambda_2) \geq \dots \geq \Re(\lambda_n)$. We call $\lambda_1(M)$ and $\mathbf{v}_1(M)$ the dominant eigenvalue and eigenvector of $M$. The
spectral radius of $M$, denoted by $\rho(M)$, is the maximum modulus of an eigenvalue of $M$.

In this paper, we only consider graphs with positively weighted edges; hence, the adjacency matrix
of a graph is always nonnegative. Conversely, given an $n \times n$ nonnegative matrix $A$, we can associate
a directed graph $\mathcal{G}_A$ such that $A$ is the adjacency matrix of $\mathcal{G}_A$. Finally, a nonnegative matrix $A$ is
irreducible if and only if its associated graph $\mathcal{G}_A$ is strongly connected.

In our derivations, we use the Perron-Frobenius lemma, from the theory of nonnegative matrices [49]:

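Lemma 1.1.1. (Perron-Frobenius) Let $M$ be a nonnegative, irreducible matrix. Then, the following
statements about its spectral radius, $\rho(M)$, hold:

(a) $\rho(M) > 0$ is a simple eigenvalue of $M$,

(b) $M\mathbf{u} = \rho(M)\,\mathbf{u}$, for some $\mathbf{u} \in \mathbb{R}^n_{++}$, and

(c) $\rho(M) = \inf\{\lambda \in \mathbb{R} : M\mathbf{u} \leq \lambda \mathbf{u} \text{ for } \mathbf{u} \in \mathbb{R}^n_{++}\}$.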

Remark 1.1.1. Since a matrix $M$ is irreducible if and only if its associated digraph $\mathcal{G}_M$ is strongly
connected, the above lemma also holds for the spectral radius of the adjacency matrix of any (positively)
weighted, strongly connected digraph.

Corollary 1.1.2. Let $M$ be a nonnegative, irreducible matrix. Then, its eigenvalue with the largest
real part, $\lambda_1(M)$, is real, simple, and equal to the spectral radius $\rho(M) > 0$.
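
As an illustration of statements (a) and (b) of the Perron-Frobenius lemma, the following minimal sketch estimates the spectral radius and a positive eigenvector of a small adjacency matrix by power iteration. The 3-node weighted cycle, the function name `perron_pair`, and the use of a diagonal shift are assumptions made for this example only; they are not part of the framework described in the chapter.

```python
def perron_pair(M, shift=1.0, iters=500):
    """Estimate (rho(M), u) for a nonnegative irreducible matrix M.

    Power iteration is run on M + shift*I, which has the same eigenvectors,
    eigenvalues moved by +shift, and a strictly positive diagonal, so the
    iteration converges even when M itself is periodic (e.g. a pure cycle).
    """
    n = len(M)
    u = [1.0] * n
    rho_shifted = 0.0
    for _ in range(iters):
        w = [sum(M[i][j] * u[j] for j in range(n)) + shift * u[i] for i in range(n)]
        rho_shifted = max(w)                 # largest entry of (M + shift*I) u
        u = [wi / rho_shifted for wi in w]   # rescale so that max(u) == 1
    return rho_shifted - shift, u


# Hypothetical directed cycle v1 -> v2 -> v3 -> v1 with weights 1, 2, 4.
# Entry A[i][j] is the weight of the edge from node j to node i.
A = [[0.0, 0.0, 4.0],
     [1.0, 0.0, 0.0],
     [0.0, 2.0, 0.0]]

rho, u = perron_pair(A)
# Residual of the eigenvector equation A u = rho u, entry by entry.
residual = max(abs(sum(A[i][j] * u[j] for j in range(3)) - rho * u[i]) for i in range(3))
print("rho =", rho, "u =", u, "residual =", residual)
```

For this cycle the eigenvalues satisfy $\lambda^3 = 1 \cdot 2 \cdot 4 = 8$, so the spectral radius is 2 and the other two eigenvalues are complex with the same modulus, which is why the sketch iterates on the shifted matrix rather than on $A$ directly.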

1.1.2 Stochastic Spreading Model in Arbitrary Networks 

We formulate the simplest version of the problem under consideration using a generalization of the SIS
model, popularly used to model spreading dynamics in networks, such as the propagation of diseases in a
networked population [?] or malware in a computer network [36, 83, ?]. This generalization
of the SIS model, called the Heterogeneous Networked SIS model (HeNeSIS), is a continuous-time networked
Markov process in which each node in the network can be in one of two possible states, namely,
susceptible or infected. In the context of systems reliability, each node in the networked Markov process
represents a component in a networked infrastructure, and the susceptible and infected states correspond
to operational and faulty states of these components, respectively. Over time, each node $v_i \in \mathcal{V}$ in the
networked Markov process can change its state according to a stochastic process parameterized by (i)
the edge propagation rates $\beta_{ij}$, and (ii) its node recovery rate $\delta_i$.

The dynamics of the HeNeSIS model can be described as follows. The state of node $v_i$ at time $t \geq 0$ is
a binary random variable $X_i(t) \in \{0, 1\}$. The state $X_i(t) = 0$ (resp., $X_i(t) = 1$) indicates that node $v_i$ is
in the susceptible (resp., infected) state. We define the vector of states as $X(t) = (X_1(t), \dots, X_n(t))^T$.

Figure 1.1: Networked Markov process with 2 states per node, corresponding to the HeNeSIS spreading
model. Infected (resp., susceptible) nodes are plotted in red (resp., blue).

The state of a node can experience two possible stochastic transitions:

(i) Assume node $v_i$ is in the susceptible state at time $t$. This node can switch to the infected state
during the time interval $[t, t + \Delta t)$ with a probability that depends on: (i) the propagation rates
$\{\beta_{ij}, \text{ for } j \in \mathcal{N}_i^{in}\}$, and (ii) the states of its in-neighbors $\{X_j(t), \text{ for } j \in \mathcal{N}_i^{in}\}$. Formally, the
probability of this transition is given by

$$\Pr(X_i(t + \Delta t) = 1 \mid X_i(t) = 0, X(t)) = \sum_{j \in \mathcal{N}_i^{in}} \beta_{ij} X_j(t)\,\Delta t + o(\Delta t), \tag{1.1}$$

where $\Delta t > 0$ is considered an asymptotically small time interval.

(ii) Assuming node $v_i$ is infected, the probability of $v_i$ recovering back to the susceptible state in the
time interval $[t, t + \Delta t)$ is given by

$$\Pr(X_i(t + \Delta t) = 0 \mid X_i(t) = 1, X(t)) = \delta_i \Delta t + o(\Delta t), \tag{1.2}$$

where $\delta_i > 0$ is the curing rate of node $v_i$.
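
The two transitions above can be sketched as a simple time-stepped Monte Carlo simulation. This is only an illustrative approximation, assuming a small fixed step `dt` and dropping the $o(\Delta t)$ terms; the function name `simulate_henesis`, the 3-node network, and all numerical rates are hypothetical.

```python
import random


def simulate_henesis(beta, delta, x0, dt, steps, rng):
    """Time-stepped approximation of the HeNeSIS transitions (1.1)-(1.2).

    beta[i][j] is the rate at which an infected node j infects node i,
    delta[i] is the recovery rate of node i, and x0 is the initial 0/1 state.
    The step dt must be small enough that every rate * dt stays below 1.
    """
    n = len(x0)
    x = list(x0)
    history = [list(x)]
    for _ in range(steps):
        nxt = list(x)
        for i in range(n):
            if x[i] == 0:
                # Susceptible node: infection pressure from infected in-neighbors.
                rate = sum(beta[i][j] * x[j] for j in range(n))
                if rng.random() < rate * dt:
                    nxt[i] = 1
            elif rng.random() < delta[i] * dt:
                # Infected node: recovers with probability delta_i * dt.
                nxt[i] = 0
        x = nxt
        history.append(list(x))
    return history


# Hypothetical 3-node directed cycle with heterogeneous propagation rates.
beta = [[0.0, 0.0, 0.8],
        [0.5, 0.0, 0.0],
        [0.0, 0.3, 0.0]]
delta = [0.4, 0.4, 0.4]
history = simulate_henesis(beta, delta, [1, 0, 0], dt=0.01, steps=1000,
                           rng=random.Random(0))
infected_counts = [sum(x) for x in history]
print("infected nodes at the final step:", infected_counts[-1])
```

An event-driven (Gillespie-type) simulation would sample the exact continuous-time process; the fixed-step version is used here only because it mirrors the form of (1.1) and (1.2) directly.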

In the context of failure propagation in networked infrastructure, $\beta_{ij}$ represents the Poisson rate at
which a failure in the element located at node $v_j$ propagates to the element in node $v_i$. Similarly,
$\delta_i$ represents the Poisson rate at which a fault at component $v_i$ is cleared. This HeNeSIS model is
therefore a continuous-time Markov process with $2^n$ states in the limit $\Delta t \to 0^+$. Unfortunately, the
exponentially increasing state space makes this model hard to analyze for large-scale networks. Using
the Kolmogorov forward equations and a mean-field approach [?], one can approximate the dynamics
of the spreading process using a system of $n$ ordinary differential equations, as follows. Let us define
$p_i(t) = \Pr(X_i(t) = 1) = \mathbb{E}(X_i(t))$, i.e., the probability of node $v_i$ being infected (or faulty) at time $t$.
Hence, the Markov differential equation [?] for the state $X_i(t) = 1$ is the following,

$$\frac{dp_i(t)}{dt} = (1 - p_i(t)) \sum_{j=1}^{n} \beta_{ij}\, p_j(t) - \delta_i\, p_i(t). \tag{1.3}$$

Considering $i = 1, \dots, n$, we obtain a system of nonlinear differential equations with complex dynamics.
In the following, we derive a sufficient condition for the spreading process to die out exponentially
fast. Let us define the vector $\mathbf{p}(t) \triangleq (p_1(t), \dots, p_n(t))^T$, and the matrices $B_{\mathcal{G}} \triangleq [\beta_{ij}]$, $D \triangleq \mathrm{diag}(\delta_i)$.
Notice that $B_{\mathcal{G}}$ is the weighted adjacency matrix of a weighted, directed graph with edge-weight function
$\mathcal{W}((v_j, v_i)) = \beta_{ij}$; in other words, the weight of the directed link from $v_j$ to $v_i$ is $\beta_{ij}$. The ODE under
consideration presents an equilibrium point at $\mathbf{p}^* = \mathbf{0}$, called the disease-free (or fault-free) equilibrium.
A stability analysis of this ODE around the equilibrium provides the following stability result [55]:

Proposition 1. Consider the nonlinear HeNeSIS model in (1.3) and assume $\beta_{ij}, \delta_i > 0$. Then, if the
eigenvalue with largest real part of $B_{\mathcal{G}} - D$ satisfies

$$\Re\left[\lambda_1(B_{\mathcal{G}} - D)\right] \leq -\varepsilon, \tag{1.4}$$

for some $\varepsilon > 0$, the disease-free equilibrium ($\mathbf{p}^* = \mathbf{0}$) is globally exponentially stable, i.e.,
$\|\mathbf{p}(t)\| \leq \|\mathbf{p}(0)\|\, K \exp(-\varepsilon t)$, for some $K > 0$.
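
To make the stability condition concrete, the sketch below integrates the mean-field equations with a forward Euler scheme and evaluates the dominant eigenvalue of $B_{\mathcal{G}} - D$ for a small example. The network, the rates, the step size, and the function names are assumptions chosen for illustration. The eigenvalue is obtained by power iteration on the nonnegative matrix $B_{\mathcal{G}} - D + cI$ with $c > \max_i \delta_i$, which shares its eigenvectors with $B_{\mathcal{G}} - D$; this relies on the graph being strongly connected.

```python
def lambda1_B_minus_D(B, delta, iters=500):
    """Dominant eigenvalue of B - D via power iteration on B - D + c*I.

    With c = max(delta) + 1 the shifted matrix is nonnegative with a positive
    diagonal, so for a strongly connected graph the iteration converges to
    rho(B - D + c*I) = lambda_1(B - D) + c.
    """
    n = len(B)
    c = max(delta) + 1.0
    u = [1.0] * n
    top = 0.0
    for _ in range(iters):
        w = [sum(B[i][j] * u[j] for j in range(n)) + (c - delta[i]) * u[i]
             for i in range(n)]
        top = max(w)
        u = [wi / top for wi in w]
    return top - c


def integrate_mean_field(B, delta, p0, dt, steps):
    """Forward Euler integration of dp_i/dt = (1 - p_i) sum_j B_ij p_j - delta_i p_i."""
    n = len(p0)
    p = list(p0)
    for _ in range(steps):
        force = [sum(B[i][j] * p[j] for j in range(n)) for i in range(n)]
        p = [p[i] + dt * ((1.0 - p[i]) * force[i] - delta[i] * p[i]) for i in range(n)]
    return p


# Hypothetical 3-node directed cycle; B[i][j] is the rate from node j to node i.
B = [[0.0, 0.0, 0.8],
     [0.5, 0.0, 0.0],
     [0.0, 0.3, 0.0]]
delta = [1.0, 1.0, 1.0]

lam1 = lambda1_B_minus_D(B, delta)
p_final = integrate_mean_field(B, delta, [0.9, 0.9, 0.9], dt=0.01, steps=2000)
print("lambda_1(B - D) =", lam1)
print("infection probabilities at t = 20:", p_final)
```

In this example the dominant eigenvalue is negative, so the condition of Proposition 1 holds with $\varepsilon = -\lambda_1(B_{\mathcal{G}} - D)$, and the integrated probabilities shrink towards the disease-free equilibrium.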


1.2 A Quasiconvex Framework for Optimal Resource Allocation 


Assume that the fault propagation and recovery rates, $\beta_{ij}$ and $\delta_i$, are adjustable by allocating protection
resources on the edges and nodes of the networked Markov process. We consider two types of protection
resources: (i) preventive resources (e.g., vaccinations in the case of disease spreading), and (ii) corrective
resources (e.g., antidotes). We assume that the propagation rate $\beta_{ij}$ can be reduced using preventive
resources. Also, allocating corrective resources at node $v_i$ increases the recovery rate $\delta_i$. We assume
that we are able to, simultaneously, modify the fault propagation and recovery rates of $v_i$ within feasible
intervals $0 < \underline{\beta}_{ij} \leq \beta_{ij} \leq \bar{\beta}_{ij}$ and $0 < \underline{\delta}_i \leq \delta_i \leq \bar{\delta}_i \leq \Delta$, where $\Delta$ is a uniform upper bound on the
achievable recovery rate, which is assumed to be known a priori. The particular values of $\beta_{ij}$ and $\delta_i$
depend on the amount of preventive and corrective resources allocated at node $v_i$. We consider that
protection resources have an associated cost. We define two cost functions, the prevention (or vaccination)
cost function $f_{ij}(\beta_{ij})$ and the correction (or antidote) cost function $g_i(\delta_i)$, that account for the cost of
tuning the fault propagation and recovery rates to $\beta_{ij} \in [\underline{\beta}_{ij}, \bar{\beta}_{ij}]$ and $\delta_i \in [\underline{\delta}_i, \bar{\delta}_i]$, respectively.


In this context of protection design, one can study a type of resource allocation problem called
the budget-constrained allocation problem. In the budget-constrained problem we are assigned a total
budget $C$ to invest on protection resources and we need to find the best allocation of preventive and/or
corrective resources to maximize a measure of the network resilience. In [65] and [66], the authors
proposed a measure of the network resilience based on the norm of the vector of fault
probabilities, $\|\mathbf{p}(t)\|$. In particular, the exponential rate of decay of such a vector is a measure of the
ability of the networked infrastructure to recover from random failures. In other words, assuming that
we are able to control the system to satisfy the condition $\|\mathbf{p}(t)\| \leq \|\mathbf{p}(0)\|\, K \exp(-\varepsilon t)$, the exponential
decay rate $\varepsilon$ measures the ability of the networked infrastructure to 'self-heal' from random contingencies.

Based on Proposition 1, the decay rate of an epidemic outbreak is determined by $\varepsilon$ in (1.4). Thus,
given a budget $C$, the budget-constrained allocation problem is formulated as follows:


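Problem 1. (Budget-constrained allocation) Given the following elements: (i) a directed network
$\mathcal{G} = (\mathcal{V}, \mathcal{E})$ representing failure dependencies between components in a networked infrastructure, (ii) a set of
cost functions $f_{ij}(\beta_{ij})$, $g_i(\delta_i)$, (iii) bounds on the fault propagation and recovery rates
$0 < \underline{\beta}_{ij} \leq \beta_{ij} \leq \bar{\beta}_{ij}$ and $0 < \underline{\delta}_i \leq \delta_i \leq \bar{\delta}_i$, and (iv) a total budget $C$, find the cost-optimal distribution of (preventive and
corrective) protection resources to maximize the exponential decay rate $\varepsilon$.

Based on Proposition 1, we can state this problem as the following optimization program:

$$\text{maximize} \quad \varepsilon \tag{1.5}$$

$$\text{subject to} \quad \Re\left[\lambda_1(B_{\mathcal{G}} - D)\right] \leq -\varepsilon, \tag{1.6}$$

$$\sum_{(j,i) \in \mathcal{E}} f_{ij}(\beta_{ij}) + \sum_{i \in \mathcal{V}} g_i(\delta_i) \leq C, \tag{1.7}$$

$$\underline{\beta}_{ij} \leq \beta_{ij} \leq \bar{\beta}_{ij}, \ (j,i) \in \mathcal{E}; \quad \underline{\delta}_i \leq \delta_i \leq \bar{\delta}_i, \ i \in \mathcal{V}, \tag{1.8}$$

where (1.7) is the budget constraint.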

In the following section, we propose an approach to find the optimal budget-constrained allocation in
polynomial time for weighted and directed contact networks, under certain convexity assumptions on
the cost functions $f_{ij}$ and $g_i$.

1.2.1 A Geometric Programming Approach 

We propose a convex formulation to solve the budget-constrained allocation problem in weighted, directed networks using
geometric programming (GP) [9]. Geometric programs are a type of quasiconvex optimization problem
that can be easily transformed into convex programs and solved in polynomial time. We start our
exposition by briefly reviewing some concepts used in our formulation. Let $x_1, \dots, x_n > 0$ denote $n$
decision variables and define $\mathbf{x} = (x_1, \dots, x_n) \in \mathbb{R}^n_{++}$. In the context of GP, a monomial $h(\mathbf{x})$ is defined
as a real-valued function of the form $h(\mathbf{x}) = d\, x_1^{a_1} x_2^{a_2} \cdots x_n^{a_n}$ with $d > 0$ and $a_i \in \mathbb{R}$. A posynomial
function $q(\mathbf{x})$ is defined as a sum of monomials, i.e., $q(\mathbf{x}) = \sum_{k=1}^{K} c_k\, x_1^{a_{1k}} x_2^{a_{2k}} \cdots x_n^{a_{nk}}$, where $c_k > 0$.
Posynomials are closed under addition, multiplication, and nonnegative scaling. A posynomial can be
divided by a monomial, and the result is a posynomial.

A geometric program (GP) is an optimization problem of the form (see [5] for a comprehensive
treatment):

$$\begin{aligned} \text{minimize} \quad & f(\mathbf{x}) \\ \text{subject to} \quad & q_i(\mathbf{x}) \leq 1, \quad i = 1, \dots, m, \\ & h_i(\mathbf{x}) = 1, \quad i = 1, \dots, p, \end{aligned} \tag{1.9}$$

where $q_i$ are posynomial functions, $h_i$ are monomials, and $f$ is a convex function in log-scale.¹ A GP is
a quasiconvex optimization problem [5] that can be transformed to a convex problem. This conversion
is based on the logarithmic change of variables $y_i = \log x_i$, and a logarithmic transformation of the
objective and constraint functions (see [5] for details on this transformation). After this transformation,
the GP in (1.9) takes the form

$$\begin{aligned} \text{minimize} \quad & F(\mathbf{y}) \\ \text{subject to} \quad & Q_i(\mathbf{y}) \leq 0, \quad i = 1, \dots, m, \\ & \mathbf{b}_i^T \mathbf{y} + \log d_i = 0, \quad i = 1, \dots, p, \end{aligned} \tag{1.10}$$

where $Q_i(\mathbf{y}) = \log q_i(\exp \mathbf{y})$ and $F(\mathbf{y}) = \log f(\exp \mathbf{y})$. Also, assuming that $h_i(\mathbf{x}) = d_i\, x_1^{b_{1,i}} x_2^{b_{2,i}} \cdots x_n^{b_{n,i}}$,
we obtain the equality constraint above, with $\mathbf{b}_i = (b_{1,i}, \dots, b_{n,i})^T$, after the logarithmic change of
variables. Notice that, since $f(\mathbf{x})$ is convex in log-scale, $F(\mathbf{y})$ is a convex function. Also, since $q_i$ is a
posynomial (therefore, convex in log-scale), $Q_i$ is also a convex function. In conclusion, (1.10) is a
convex optimization problem in standard form and can be efficiently solved in polynomial time [9].
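
The effect of the logarithmic change of variables can be checked numerically on a small example. The sketch below, with a hypothetical two-term posynomial and hypothetical function names, samples random pairs of points and tests midpoint convexity of $Q(\mathbf{y}) = \log q(\exp \mathbf{y})$; it is a sanity check under these assumptions, not a proof.

```python
import math
import random


def posynomial(c, a, x):
    """q(x) = sum_k c[k] * prod_i x[i] ** a[k][i], with c[k] > 0 and x[i] > 0."""
    return sum(ck * math.prod(xi ** aki for xi, aki in zip(x, ak))
               for ck, ak in zip(c, a))


def log_transformed(c, a, y):
    """Q(y) = log q(exp(y)), i.e. a log-sum-exp of affine functions of y."""
    return math.log(posynomial(c, a, [math.exp(yi) for yi in y]))


# Hypothetical posynomial q(x) = 2 * x1^0.5 * x2^-1 + 3 * x1 * x2^2.
c = [2.0, 3.0]
a = [[0.5, -1.0], [1.0, 2.0]]

rng = random.Random(0)
worst_gap = 0.0
for _ in range(1000):
    y = [rng.uniform(-2.0, 2.0) for _ in range(2)]
    z = [rng.uniform(-2.0, 2.0) for _ in range(2)]
    mid = [(yi + zi) / 2.0 for yi, zi in zip(y, z)]
    # Midpoint convexity requires Q(mid) <= (Q(y) + Q(z)) / 2, i.e. gap <= 0.
    gap = log_transformed(c, a, mid) - 0.5 * (log_transformed(c, a, y)
                                              + log_transformed(c, a, z))
    worst_gap = max(worst_gap, gap)

print("largest midpoint-convexity violation found:", worst_gap)
```

The same check fails in the original variables for something as simple as the monomial $x^{0.5}$, which is concave in $x$; this is why the transformation to $\mathbf{y}$ is needed before a convex solver can be applied.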

To solve Problem 1 using GP, it is convenient to define the 'complementary' recovery rate $\hat{\delta}_i = \Delta - \delta_i$.
We can also define a 'complementary' recovery cost function as $\hat{g}_i(\hat{\delta}_i) = g_i(\Delta - \hat{\delta}_i)$; in other
words, instead of defining the recovery cost in terms of the recovery rate, $\delta_i$, we define it in terms of its
complementary value, $\hat{\delta}_i$. Hence, Problem 1 can be formulated as a GP if the cost functions $f_{ij}(\beta_{ij})$ and
$\hat{g}_i(\hat{\delta}_i)$ are posynomials (see [5], Section 8, for a treatment of the modeling abilities of monomials and
posynomials). Therefore, the total cost function $\sum_{(j,i) \in \mathcal{E}} f_{ij}(\beta_{ij}) + \sum_{i \in \mathcal{V}} \hat{g}_i(\hat{\delta}_i)$ is a posynomial.

In [55], Problem 1 is transformed into a GP, using results from the theory of nonnegative matrices and
the Perron-Frobenius lemma. The resulting formulation is described below [55]:


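Theorem 2. Consider the following elements:

(i) A directed graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ representing failure dependencies in a networked infrastructure.

(ii) Posynomial cost functions $f_{ij}(\beta_{ij})$ and $\hat{g}_i(\hat{\delta}_i)$.

(iii) Bounds on the failure propagation and recovery rates $0 < \underline{\beta}_{ij} \leq \beta_{ij} \leq \bar{\beta}_{ij}$ and $0 < \underline{\delta}_i \leq \delta_i \leq \bar{\delta}_i \leq \Delta$.

(iv) A maximum budget $C$ to invest in protection resources.

Then, the optimal allocation of protection resources on edge $(v_j, v_i)$ is given by $f_{ij}(\beta_{ij}^*)$ and the optimal
allocation of recovery resources at node $v_i$ is $g_i(\Delta - \hat{\delta}_i^*)$, where $\beta_{ij}^*, \hat{\delta}_i^*$ are the optimal solution of the
following GP:

$$\underset{\lambda,\ \{u_i\},\ \{\beta_{ij}\},\ \{\hat{\delta}_i\}}{\text{minimize}} \quad \lambda \tag{1.11}$$

$$\text{subject to} \quad \sum_{j=1}^{n} \beta_{ij} u_j + \hat{\delta}_i u_i \leq \lambda u_i, \quad i = 1, \dots, n, \tag{1.12}$$

$$\sum_{(j,i) \in \mathcal{E}} f_{ij}(\beta_{ij}) + \sum_{i \in \mathcal{V}} \hat{g}_i(\hat{\delta}_i) \leq C, \tag{1.13}$$

$$\underline{\beta}_{ij} \leq \beta_{ij} \leq \bar{\beta}_{ij}, \ (j,i) \in \mathcal{E}; \quad \Delta - \bar{\delta}_i \leq \hat{\delta}_i \leq \Delta - \underline{\delta}_i, \ i \in \mathcal{V}. \tag{1.14}$$

¹Geometric programs in standard form are usually formulated assuming $f(\mathbf{x})$ is a posynomial. In our formulation, we
assume that $f(\mathbf{x})$ is in the broader class of convex functions in logarithmic scale.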


It is easy to verify that the above formulation is a GP; hence, it can be efficiently transformed into
a convex optimization program and solved in polynomial time [66]. The tools presented are illustrated
with a numerical simulation involving the world-wide air transportation network.
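
The role of the eigenvalue-type constraint in the GP can be illustrated without a solver. The sketch below assumes that this constraint has the form $\sum_j \beta_{ij} u_j + \hat{\delta}_i u_i \leq \lambda u_i$ with $\hat{\delta}_i = \Delta - \delta_i$, as discussed above. Given fixed rates and any positive vector $\mathbf{u}$, it computes the smallest feasible $\lambda$ and the corresponding certified decay rate $\varepsilon = \Delta - \lambda$. The function name, the 3-node example, and the numerical values are hypothetical.

```python
def certified_decay_rate(B, delta, Delta, u):
    """Decay rate certified by a positive vector u.

    The smallest lambda with sum_j B_ij u_j + (Delta - delta_i) u_i <= lambda u_i
    for every i is the largest row-wise ratio below. By the Perron-Frobenius
    lemma, lambda bounds the spectral radius of B + diag(Delta - delta), so
    lambda_1(B - D) <= lambda - Delta and epsilon = Delta - lambda.
    """
    n = len(B)
    lam = max((sum(B[i][j] * u[j] for j in range(n)) + (Delta - delta[i]) * u[i]) / u[i]
              for i in range(n))
    return Delta - lam


# Hypothetical 3-node directed cycle; B[i][j] is the rate from node j to node i.
B = [[0.0, 0.0, 0.8],
     [0.5, 0.0, 0.0],
     [0.0, 0.3, 0.0]]
delta = [1.0, 1.0, 1.0]
Delta = 2.0

eps_ones = certified_decay_rate(B, delta, Delta, [1.0, 1.0, 1.0])
# A vector close to the dominant eigenvector gives a tighter certificate.
eps_tuned = certified_decay_rate(B, delta, Delta, [1.0, 1.0137, 0.6166])
print("certified decay rates:", eps_ones, eps_tuned)
```

Any feasible point of the GP therefore certifies a decay rate, and minimizing $\lambda$ jointly over $\mathbf{u}$ and the rates tightens this certificate as far as the budget allows. Solving the full program requires a GP-capable convex solver, which is outside the scope of this sketch.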


1.2.2 Controlling Epidemic Outbreaks in a Transportation Network 


We apply the above results to the design of a cost-optimal protection strategy against epidemic outbreaks
that propagate through the air transportation network [?]. We analyze real data from the world-wide
air transportation network and find the optimal distribution of vaccines and antidotes to prevent the viral
spreading of an epidemic outbreak. We consider the budget-constrained problem in our simulations.
We limit our analysis to an air transportation network spanning the major airports in the world; in
particular, we consider only airports having an incoming traffic greater than 10 million passengers per
year (MPPY). There are 56 such airports world-wide and they are connected via 1,843 direct flights,
which we represent as directed edges in a graph. To each directed edge $(i, j)$, we assign a 'contact' weight,
$a_{ji}$, equal to the number of passengers taking that flight throughout the year² (in MPPY units).

In this problem, we assume that allocating preventive resources (e.g., vaccines) at a particular airport
scales down the propagation rate of all the incoming links in proportion to the incoming traffic. In other
words, we assume that $\beta_{ij} = \beta_i a_{ij}$, where $a_{ij}$ is the number of passengers per year (in MPPY) that
travel from airport $v_j$ to airport $v_i$, and $\beta_i$ is a scaling factor that depends on the destination airport
only. In our simulations, we consider the cost functions $f_{ij}(\beta_{ij}) = f_i(\beta_i)$ and $g_i(\delta_i)$,
where $f_i$ and $g_i$ are the following functions (plotted in Figure 1.2):

$$f_i(\beta_i) = \frac{\beta_i^{-1} - \bar{\beta}_i^{-1}}{\underline{\beta}_i^{-1} - \bar{\beta}_i^{-1}}, \qquad g_i(\delta_i) = \frac{(1 - \delta_i)^{-1} - (1 - \underline{\delta}_i)^{-1}}{(1 - \bar{\delta}_i)^{-1} - (1 - \underline{\delta}_i)^{-1}}. \tag{1.15}$$

Notice that as we increase the amount invested on vaccines, the propagation rate of that node is reduced
from $\bar{\beta}_i$ to $\underline{\beta}_i$ (red line). Similarly, as we increase the amount invested on antidotes at a node $v_i$, the
recovery rate grows from $\underline{\delta}_i$ to $\bar{\delta}_i$ (blue line). Notice that both cost functions present diminishing marginal
benefit on investment.
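
The diminishing marginal benefit can be seen by evaluating the cost functions on a grid. The sketch below assumes the normalized reciprocal form $f_i(\beta_i) = (\beta_i^{-1} - \bar{\beta}_i^{-1}) / (\underline{\beta}_i^{-1} - \bar{\beta}_i^{-1})$ and $g_i(\delta_i) = ((1 - \delta_i)^{-1} - (1 - \underline{\delta}_i)^{-1}) / ((1 - \bar{\delta}_i)^{-1} - (1 - \underline{\delta}_i)^{-1})$ for (1.15); the function names and the numerical bounds are hypothetical and chosen only for illustration.

```python
def vaccination_cost(beta, beta_lo, beta_hi):
    """Assumed prevention cost: 0 at beta = beta_hi and 1 at beta = beta_lo."""
    return (1.0 / beta - 1.0 / beta_hi) / (1.0 / beta_lo - 1.0 / beta_hi)


def antidote_cost(delta, delta_lo, delta_hi):
    """Assumed correction cost: 0 at delta = delta_lo and 1 at delta = delta_hi."""
    num = 1.0 / (1.0 - delta) - 1.0 / (1.0 - delta_lo)
    den = 1.0 / (1.0 - delta_hi) - 1.0 / (1.0 - delta_lo)
    return num / den


# Hypothetical bounds on the achievable rates.
beta_lo, beta_hi = 0.1, 0.5
delta_lo, delta_hi = 0.1, 0.9

# Cost of lowering the propagation rate in equal steps of 0.1.
betas = [0.5, 0.4, 0.3, 0.2, 0.1]
vacc_costs = [vaccination_cost(b, beta_lo, beta_hi) for b in betas]
vacc_increments = [c2 - c1 for c1, c2 in zip(vacc_costs, vacc_costs[1:])]

# Cost of raising the recovery rate in equal steps of 0.2.
deltas = [0.1, 0.3, 0.5, 0.7, 0.9]
anti_costs = [antidote_cost(d, delta_lo, delta_hi) for d in deltas]
anti_increments = [c2 - c1 for c1, c2 in zip(anti_costs, anti_costs[1:])]

print("extra cost per 0.1 reduction of beta:", vacc_increments)
print("extra cost per 0.2 increase of delta:", anti_increments)
```

Under these assumptions each additional equal-sized improvement in a rate costs more than the previous one, which is the diminishing-returns behaviour described above.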

²Although we could have chosen other functions of the traffic to design these contact weights, we illustrate our framework
using this simple set of weights. Using different, possibly nonlinear, functions to generate these weights does not influence
the tractability of our framework.

Figure 1.2: Propagation rate (in red, and multiplied by 20, to improve visualization) and recovery rate
(in blue) achieved at node $v_i$ after an investment on protection (in abscissas) is made on that node.

Using the air transportation network and the cost functions specified above, we solve the
budget-constrained allocation problem using the geometric program in Theorem 2. In the left subplot of Figure
1.3 we present a scatter plot with 56 circles (one circle per airport), where the abscissa of each circle is
equal to $g_i(\delta_i^*)$ and the ordinate is $f_i(\beta_i^*)$, namely, the investments on antidotes and vaccines, respectively,
on the airport at node $v_i$, for all $v_i \in \mathcal{V}$. We observe an interesting pattern in the allocation of preventive
and corrective resources in the network. In particular, we have that in the optimal allocation some
airports receive only corrective resources (indicated by circles located on top of the x-axis), and some
airports receive a mixture of preventive and corrective resources. In the center and right subplots in Fig.
1.3 we compare the distribution of resources with the in-degree and the PageRank³ centralities of the
nodes in the network [?]. In the center subplot, we have a scatter plot where the ordinates represent
investments on prevention (red +'s), correction (blue x's), and total investment (the sum of prevention
and correction investments, in black circles) for each airport, while the abscissas are the (weighted)
in-degrees⁴ of the airports under consideration. We again observe a nontrivial pattern in the allocation
of investments for protection. In particular, for airports with incoming traffic less than 4 MPPY, only
corrective resources are needed. Airports with incoming traffic over 4 MPPY receive both preventive and
corrective resources. In the right subplot in Fig. 1.3 we include a scatter plot of the amount invested on
prevention and correction for each airport versus its PageRank centrality in the transportation network.
We observe that there is a strong correlation between the network centrality measures and the level
of investment per node. In particular, there is an almost affine relationship between the total level of
investment (black circles in Fig. 1.3, center) and the incoming traffic of an airport. Furthermore, there is
a clear piecewise affine relationship between the levels of investment on prevention and correction
(Fig. 1.3, left). Similar relationships also hold when comparing the levels of investment versus the
PageRank centralities in the airport network (Fig. 1.3, right).

Notice that the above distribution of protection resources corresponds to the particular cost functions
chosen for our simulations. Changes in these cost functions allow us to observe interesting phenomena
in the optimal distribution of protection resources, such as airports with a zero protection assignment
at optimality, or a distribution of resources with a negative correlation with centrality measures. For
example, it is possible to build cases in which nodes with low centrality (e.g., nodes with low incoming
traffic and PageRank) are assigned at optimality a higher level of protection than more central nodes.

³The PageRank vector $\mathbf{r}$, before normalization, can be computed as $\mathbf{r} = (I - \alpha A_{\mathcal{G}}\, \mathrm{diag}(1/\deg_{out}(v_i)))^{-1} \mathbf{1}$, where $\mathbf{1}$ is
the vector of all ones and $\alpha$ is typically chosen to be 0.85.

⁴It is worth remarking that the in-degree in the abscissa of Fig. 1.3 accounts for the incoming traffic into airport $v_i$
coming only from those airports in the selective group of airports with an incoming traffic over 10 MPPY. Therefore, the
in-degree does not represent the total incoming traffic into the airport.

Figure 1.3: Results from the budget-constrained allocation problem. From left to right, we have (a) a scatter
plot with the investment on correction versus prevention per node, (b) a scatter plot with the investment on
protection per node and the in-degrees, and (c) a scatter plot with the investment on protection per node versus
PageRank centralities.


1.3 Towards a General Framework for Network Protection 

The framework presented in this chapter has been recently extended in several directions. In what
follows, we briefly describe the following extensions: (i) a generalized framework to cover more realistic
epidemic models (beyond SIS), (ii) a novel data-driven framework able to handle network uncertainties,
and (iii) an analysis tool that allows us to study non-Poissonian transmission and recovery rates.

1.3.1 Generalized Epidemic Models 

In [52, 12], Nowzari et al. recently studied a model of spreading, called the Generalized
Susceptible-Exposed-Infected-Vigilant (G-SEIV) model, that generalizes many of the models in the
literature, including SIS, SIR, SIRS, SEIR, SEIV, SEIS, and SIV [50, 30]. This model has two infectious
states, called Infected (I) and Exposed (E), that allow us to model human behavioral changes. An
individual is in the Exposed state if she is infected and contagious, but not yet aware that she is sick
(i.e., in an asymptomatic incubation period). Individuals in the Infected state are infected and aware of the
disease, which induces a different behavior. For instance, a person knowingly infected with a disease
may have much less contact with others, yielding less chance of spreading the infection. The G-SEIV
model also includes a Vigilant (V) state, which represents
healthy individuals who are aware of the disease being spread. Hence, individuals in the Vigilant state are
more careful in their social contacts and less likely to be infected. The dynamics
of this model are described below.

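Let us denote by $[S_i(t), E_i(t), I_i(t), V_i(t)]^T$ the probability vector associated with node $i$ being in
each one of these states: Susceptible, Exposed, Infected, or Vigilant, respectively. Using a mean-field
approximation, the dynamics of the G-SEIV model can be described as:

$$\begin{aligned} \dot{S}_i(t) &= \gamma_i V_i(t) - \theta_i S_i(t) - S_i(t) \Big( \sum_{j \in \mathcal{N}_i^{in}} \beta_i^E E_j(t) + \beta_i^I I_j(t) \Big), \\ \dot{E}_i(t) &= S_i(t) \Big( \sum_{j \in \mathcal{N}_i^{in}} \beta_i^E E_j(t) + \beta_i^I I_j(t) \Big) - \varepsilon_i E_i(t), \\ \dot{I}_i(t) &= \varepsilon_i E_i(t) - \delta_i I_i(t), \\ \dot{V}_i(t) &= \delta_i I_i(t) + \theta_i S_i(t) - \gamma_i V_i(t). \end{aligned} \tag{1.16}$$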
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Figure 1.4: Generalized Susceptible-Exposed-Infected-Vigilant model in a network of individuals. (Source 
of figure: [52]1. 
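As a rough illustration of the mean-field dynamics (1.16), the following sketch integrates the four equations with a forward-Euler scheme. The 3-node path graph, the rate values, the step size, and the function names `gseiv_rhs` and `simulate` are all illustrative assumptions, not taken from [52]; the infection rates are assumed to be indexed by the receiving node $i$.

```python
import numpy as np

def gseiv_rhs(state, A, beta_E, beta_I, delta, eps, gamma, theta):
    """Right-hand side of the mean-field G-SEIV dynamics (1.16).

    state is a 4 x N array with rows S, E, I, V; A[i, j] = 1 if j is a
    neighbour of i. Rates may be scalars or length-N vectors.
    """
    S, E, I, V = state
    # Infection pressure on node i from its exposed and infected neighbours.
    pressure = beta_E * (A @ E) + beta_I * (A @ I)
    dS = gamma * V - theta * S - S * pressure
    dE = S * pressure - eps * E
    dI = eps * E - delta * I
    dV = delta * I + theta * S - gamma * V
    return np.array([dS, dE, dI, dV])

def simulate(state0, A, params, dt=0.01, steps=5000):
    """Forward-Euler integration (assumed scheme, chosen for brevity)."""
    state = np.array(state0, dtype=float)
    for _ in range(steps):
        state = state + dt * gseiv_rhs(state, A, **params)
    return state

# Hypothetical 3-node undirected path graph and illustrative rates.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
params = dict(beta_E=0.2, beta_I=0.1, delta=0.5, eps=0.4, gamma=0.1, theta=0.2)
# Node 0 starts with a 10% probability of being Exposed.
state0 = [[0.9, 1.0, 1.0],   # S
          [0.1, 0.0, 0.0],   # E
          [0.0, 0.0, 0.0],   # I
          [0.0, 0.0, 0.0]]   # V
final = simulate(state0, A, params)
print(final.round(4))
```

The four right-hand sides sum to zero for every node, so the columns of `final` should keep summing to one; this is a quick sanity check on any implementation of (1.16).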


Using nonlinear analysis techniques, Nowzari et al. derived the following necessary and sufficient 
condition for the disease to die out exponentially fast: 

Theorem 1.3.1. (Conditions for stability of disease-free equilibrium) The disease-free equilibrium
of the G-SEIV model is globally exponentially stable if and only if the following matrix,

is Hurwitz, where $B^E = \operatorname{diag}(\beta^E)$, $B^I = \operatorname{diag}(\beta^I)$, $D = \operatorname{diag}(\delta)$, $E = \operatorname{diag}(\varepsilon)$, T = diag

The above result can be used to mitigate, or eliminate completely, the spreading of the disease.
In [52], the authors considered three types of resources available to control the disease: corrective
resources (e.g., antidotes), preventative resources (e.g., vaccines), and preemptive resources (e.g.,
awareness campaigns and/or limiting traffic). Under mild conditions on the cost functions of these
resources, the authors were able to bound the rate of spreading of the undesired disease.


1.3.2 Data-Driven Allocation 

Although current vaccination strategies assume full knowledge about the network structure and spreading
rates, in most practical applications, this information is only partially known. To elaborate on this point,
let us consider the following setup. Assume that each node in the network represents a subpopulation
(e.g., a city district) and that edges are determined by commuting patterns between districts.
In practice, one can use traffic information and geographical proximity to infer the existence of an
edge connecting districts. For example, in Fig. 1.5 we represent such a network for those districts
in West Africa affected by the 2014 Ebola outbreak. On the other hand, it is very challenging to use
this information to estimate the contact rates between subpopulations. Inspired by this example, we
considered in [28] a networked SIS model taking place in a contact network of unknown contact rates. To
extract information about these unknown rates, we assumed that we have access to time series describing
the evolution of the spreading process observed from a collection of sensor nodes during a finite time
interval.

Guinea

Date           Cases   Deaths
22 Mar 2014       49       29
24 Mar 2014       86       59
25 Mar 2014       86       60
26 Mar 2014       86       62
27 Mar 2014      103       66
28 Mar 2014      112       70
31 Mar 2014      122       80
01 Apr 2014      127       83
04 Apr 2014      143       86
07 Apr 2014      151       95
09 Apr 2014      158      101
11 Apr 2014      159      106
14 Apr 2014      168      108
16 Apr 2014      197      122
17 Apr 2014      203      129
20 Apr 2014      208      136
23 Apr 2014      218      141
01 May 2014      226      149
03 May 2014      231      155

Figure 1.5: Network of districts in West Africa affected by the 2014 Ebola outbreak. (Source: WHO)

In contrast to current network identification heuristics, in which a single network is identified to
explain the observed data, the authors in [28] developed a robust optimization framework in which
an uncertainty set containing all networks that are coherent with empirical observations is defined.
This characterization of the uncertainty set of networks is tractable in the context of conic geometric
programming, recently proposed by Chandrasekaran and Shah. In this context, the authors were able
to efficiently find the optimal allocation of resources to control the worst-case spread that can take place
in the uncertainty set of networks. In order to extract information about the contact rates, the authors
considered two different sources of information that are usually available in epidemiological problems.
These sources can be classified as (i) prior information about the network topology and parameters
of the disease, and (ii) empirical observations about the spreading dynamics. In particular, one can
consider the following pieces of prior information:


(i) Assume that the sparsity pattern of the contact matrix $B_{\mathcal{G}}$ is given, although its entries are
unknown. This piece of information may be inferred from geographical proximity, commuting
patterns, or the presence of transportation links connecting subpopulations.


(ii) Assume that upper and lower bounds on the spreading rates associated with each edge, i.e.,
$\underline{\beta}_{ij} \le \beta_{ij} \le \overline{\beta}_{ij}$ for all $(i,j) \in \mathcal{E}$, are available. This could be inferred from traffic
densities and subpopulation sizes.


(iii) In practice, each district contains a large number of individuals. Therefore, one can use the average
recovery rate in the absence of vaccination as an estimation of the nodal recovery rate. We denote
this ‘natural’ recovery rate by $\delta^0$, and assume it to be known.


Apart from these pieces of prior information, the authors in [28] also assumed that they had access to
partial observations about the evolution of the spread over a finite time interval. In particular, assume
that we observe the dynamics of the disease for $t \in [0, T]$ from a collection of sensor nodes $V_S \subset V$.
Based on these pieces of information, one can define an uncertainty set that contains all contact matrices
$B_{\mathcal{G}}$ consistent with both empirical observations and prior knowledge. This set contains those contact
matrices $B_{\mathcal{G}}$ such that the transmission rates $\{\beta_{ij}\}$ are consistent with the disease dynamics.

In order to eradicate the disease at the fastest rate possible, the authors in [28] considered the
following control problem:


Problem 3. (Data-driven optimal allocation) Assume the following pieces of information about a viral 
spread are given: 

(i) prior information about the state matrix (as described in items (i)-(iii) above);

(ii) a finite (and possibly sparse) data series representing partial evolution of the spread over a set of
sensor nodes $V_S \subset V$ during the time interval $t \in [0, T]$;


(iii) a set of vaccine cost functions $g_i$ for all $i \in V_C$, and a range of feasible recovery rates
$[\underline{\delta}_i, \overline{\delta}_i]$ such that $\overline{\delta}_i \ge \delta_i \ge \underline{\delta}_i > 0$;

(iv) a fixed budget $C > 0$ to be allocated throughout a set of control nodes in $V_C \subset V$, so that
$\sum_{i \in V_C} g_i(\delta_i) \le C$.

Find the cost-constrained allocation of control resources to eradicate the disease at the fastest possible
exponential rate, measured as $\rho(M(B_{\mathcal{G}}, \delta))$, over the uncertainty set $\Delta_{B_{\mathcal{G}}}$ of contact matrices coherent
with prior knowledge and the observations.

From the perspective of optimization, Problem 3 is equivalent to finding the optimal allocation of
resources to minimize the worst-case (i.e., maximum possible) decay rate $\rho(M(B_{\mathcal{G}}, \delta))$ over all
$B_{\mathcal{G}} \in \Delta_{B_{\mathcal{G}}}$.

In Han et al. [28], a robust optimization framework was developed to solve this problem, even in the
presence of sparse observations.

1.3.3 Non-Poissonian Rates 

The vast majority of spreading models over networks assume exponentially distributed transmission and 
recovery rates. In contrast, empirical observations indicate that most real-world spreading processes do 
not satisfy this assumption [151IT71I15] . For example, the transmission rates of human immunodeficiency 
viruses present a distribution far from exponential [5]. In the context of online social networks, empirical
studies show that the rate of spreading of information follows (approximately) a log-normal distribution
[13], [75].

There are only a few results available for analyzing spreading processes over networks with
non-exponential transmission and recovery rates. Experimental work has confirmed the dramatic
effect that non-exponential rates can have on the speed of spreading, as well as on the epidemic threshold.
In [34], an analytically solvable (although rather simplistic) model of spreading with non-exponential rates
was proposed. An approximate criterion for epidemic eradication over graphs with general transmission
and recovery times, based on asymptotic approximations, has also been proposed.

In the recent work [54], [56], the authors propose an alternative approach to analyze general
transmission and recovery rates using phase-type distributions. In particular, they derive conditions for
disease eradication using transmission and recovery times that follow phase-type distributions.
The class of phase-type distributions is dense in the space of positive-valued distributions; hence, it
can be used to theoretically analyze arbitrary transmission and recovery rates. Furthermore, there are
efficient algorithms to compute the parameters of a phase-type distribution that approximates any given
distribution. The key tool in this analysis is a vectorial representation that can
be used to represent phase-type distributions.
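To make the notion of a phase-type distribution concrete, the sketch below uses the standard textbook parametrization (not necessarily the vectorial representation used in [54], [56]): the time to absorption of a finite-state continuous-time Markov chain with initial distribution `alpha` over the transient phases and sub-generator matrix `T`. Its $n$-th raw moment is $(-1)^n n!\, \alpha T^{-n} \mathbf{1}$. The helper names and the Erlang example are illustrative assumptions.

```python
import numpy as np
from math import factorial

def phase_type_moment(alpha, T, order=1):
    """n-th raw moment of PH(alpha, T): (-1)^n * n! * alpha @ T^{-n} @ 1."""
    alpha = np.asarray(alpha, dtype=float)
    T = np.asarray(T, dtype=float)
    x = np.ones(T.shape[0])
    for _ in range(order):
        x = np.linalg.solve(T, x)  # apply T^{-1} once per moment order
    return (-1) ** order * factorial(order) * float(alpha @ x)

def erlang_phase_type(k, rate):
    """Erlang(k, rate) written as a chain of k exponential phases."""
    alpha = np.zeros(k)
    alpha[0] = 1.0                                   # always start in phase 1
    T = -rate * np.eye(k) + rate * np.eye(k, k=1)    # move to the next phase at `rate`
    return alpha, T

alpha, T = erlang_phase_type(k=3, rate=2.0)
mean = phase_type_moment(alpha, T, order=1)
variance = phase_type_moment(alpha, T, order=2) - mean ** 2
print(mean, variance)  # Erlang(3, 2): mean k/rate = 1.5, variance k/rate^2 = 0.75
```

An exponential rate is the special case `k=1`; increasing `k` at a fixed mean gives a progressively less variable (less "Poissonian") transmission or recovery time.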


1.4 Comparisons with Common Heuristics 

Usual approaches to distribute protection resources in a network of agents susceptible to cascade failures
are heuristics based on network centrality measures. As in the optimal framework presented in the
previous section, much of the literature uses bio-inspired epidemic models when studying harmful
processes with the ability to spread between interconnected agents. The main idea behind heuristic protection
strategies is to rank agents according to different measures of importance based on their location in the
network and greedily distribute protection resources based on each agent's rank. For example, Cohen
et al. [18] proposed a simple protection strategy called acquaintance immunization policy in which the
most connected neighbor of a randomly selected node is given protective resources. This strategy was proved
to be much more efficient than random allocation of protective resources. Hayashi et al. [53] proposed
a simple heuristic called targeted immunization consisting of greedily choosing nodes with the highest
degree (number of connections) in scale-free graphs. Chung et al. [17] studied a greedy heuristic
protection strategy based on the PageRank vector of the contact graph. Tong et al. and Giakkoupis et
al. [25] proposed greedy heuristics based on protecting those agents that induce the highest drop in the
dominant eigenvalue of the contact graph. Recently, Prakash et al. proposed several greedy
heuristics to contain harmful cascades in directed networks when nodes can be partially protected (instead of
completely removed, as assumed in previous work). These heuristics, as those in [75], [29], are based on
eigenvalue perturbation analysis.

The heuristic methods in the literature are designed for a single resource type, predominantly the 
protective resources. A simplified variant of the budget-constrained allocation problem is presented with 
only protection-type resources in order to compare the optimal solution with heuristic solutions. 

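Problem 4. The Network Protection Problem is given by

\[
\begin{aligned}
\max \quad & \epsilon \\
\text{s.t.} \quad & \Re\left[\lambda_1\left(\operatorname{diag}(\beta) A - \delta I\right)\right] \le -\epsilon, \\
& \sum_{i=1}^{n} f(\beta_i) \le C, \\
& \underline{\beta} \le \beta_i \le \overline{\beta}, \quad \forall i \in V.
\end{aligned}
\]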


1.4.1 Greedy, Centrality Based Strategies 

Definition 1.4.1. Extract the effective objective in Problem 4 which is induced by the epigraph
form. Define

\[
\epsilon(\beta) = -\Re\left[\lambda_1(BA - \delta I)\right] \tag{1.18}
\]

where $B = \operatorname{diag}(\beta)$ for any feasible resource allocation $\beta$.

Monotonicity and continuity of the function $\epsilon(\beta)$ guarantee that fixing any feasible $\beta$ and maximizing
over $\epsilon$ always causes the constraint $\Re[\lambda_1(BA - \delta I)] \le -\epsilon$ to become tight. The optimal point
$(\epsilon^*, \beta^*)$ of Problem 4 therefore satisfies

\[
\epsilon^* = -\Re\left[\lambda_1(\operatorname{diag}(\beta^*) A - \delta I)\right].
\]

When solving for the resource allocation $\beta$, $\epsilon(\beta)$ is treated as the effective objective in Problem 4.

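The effective objective (1.18) is straightforward to evaluate numerically. The sketch below is one possible implementation; the function name `effective_objective` and the directed 3-cycle used as input are assumptions made for illustration, with $\lambda_1$ taken as the eigenvalue of largest real part.

```python
import numpy as np

def effective_objective(beta, A, delta):
    """epsilon(beta) = -Re[lambda_1(diag(beta) A - delta I)], as in (1.18).

    lambda_1 is taken to be the eigenvalue with the largest real part.
    """
    beta = np.asarray(beta, dtype=float)
    A = np.asarray(A, dtype=float)
    M = np.diag(beta) @ A - delta * np.eye(A.shape[0])
    return -np.max(np.linalg.eigvals(M).real)

# Hypothetical example: a directed 3-cycle with uniform infection rates.
A = np.array([[0, 1, 0],
              [0, 0, 1],
              [1, 0, 0]])
eps_unprotected = effective_objective([0.5, 0.5, 0.5], A, delta=0.3)
eps_protected = effective_objective([0.1, 0.1, 0.1], A, delta=0.3)
print(eps_unprotected, eps_protected)
```

For a directed cycle with uniform rate $\beta$, the dominant eigenvalue of $\operatorname{diag}(\beta)A$ is $\beta$, so the two calls should return approximately $0.3 - 0.5 = -0.2$ (unstable) and $0.3 - 0.1 = 0.2$ (stable).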
Definition 1.4.2. Define the efficiency of a feasible resource allocation $\beta$ as

\[
Q(\beta) = \frac{\epsilon(\beta) - \epsilon(\overline{\beta})}{\epsilon(\beta^*) - \epsilon(\overline{\beta})} \in [0, 1] \tag{1.19}
\]

where $\beta^*$ is a resource allocation achieving the maximum in Problem 4.

The effective objective $\epsilon(\beta)$ and the cost function $f(\beta_i)$ are monotonically non-increasing in the
resource allocations $\beta_i$ at each node; therefore, $\overline{\beta}$ trivially achieves the minimum of $\epsilon$ over the set of
feasible resource allocations $\beta$.
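As a worked instance of the efficiency in Definition 1.4.2, the sketch below normalizes a convergence rate between the unprotected baseline and the optimum. The numbers are the Network A convergence rates reported in Figure 1.8; the function and variable names are illustrative.

```python
def efficiency(eps_beta, eps_opt, eps_none):
    """Q(beta): gain over the unprotected baseline, relative to the optimal gain."""
    return (eps_beta - eps_none) / (eps_opt - eps_none)

# Network A values from Figure 1.8: no allocation -1.11, optimal 0.248.
q_total_degree = efficiency(0.152, eps_opt=0.248, eps_none=-1.11)
q_out_degree = efficiency(-1.11, eps_opt=0.248, eps_none=-1.11)
print(round(q_total_degree, 3), q_out_degree)
```

The first value rounds to 0.929 and the second is exactly 0, which agrees with the efficiency row for Network A in Figure 1.8.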


Definition 1.4.3. Let $v$ be a centrality vector. Given a budget sufficient to completely protect $k$ nodes,
$C = k f(\underline{\beta})$, the greedy protection strategy $\beta_v$ is to completely protect the $k$ nodes with the highest values
in $v$. Define the protection fraction $r = k/N$, where $N = n + m$ is the total number of nodes.
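The greedy strategy of Definition 1.4.3 amounts to a sort followed by a thresholded assignment. A minimal sketch follows; the function name, the tie-breaking rule (lowest node index first), and the use of the out-degree column for Network G from Figure 1.8 are assumptions for illustration.

```python
import numpy as np

def greedy_allocation(centrality, k, beta_low, beta_high):
    """Fully protect the k nodes with the highest centrality.

    Protected nodes get the minimum infection rate beta_low; all others
    keep the unprotected rate beta_high. Ties are broken by node index.
    """
    centrality = np.asarray(centrality, dtype=float)
    beta = np.full(centrality.shape, beta_high)
    # Stable sort on negated scores keeps the original order among ties.
    top_k = np.argsort(-centrality, kind="stable")[:k]
    beta[top_k] = beta_low
    return beta

# Out-degree centralities of Network G as reported in Figure 1.8.
out_degree_G = [6, 6, 6, 1, 1, 1, 1, 1, 1]
beta_deg = greedy_allocation(out_degree_G, k=3, beta_low=0.01, beta_high=0.5)
print(beta_deg)
```

With $k = 3$ the whole budget goes to nodes 1-3, which reproduces the "Out Degree Allocation" column for Network G in Figure 1.8.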

Common centrality measures used for heuristics are degree and eigenvector centrality [29]. PageRank
centrality is used in place of eigenvector centrality in the case of general digraphs. While
PageRank depends on a parameter $\alpha$, we drop $\alpha$ from our notation because our results hold for the whole
family of PageRank vectors generated by non-trivial choices of $\alpha \in (0,1)$.


Theorem 1.4.1. Given the Network Protection Problem defined in Problem 4 with budget $C$, there
exists a network $G$ satisfying

\[
Q(\beta_{\mathrm{deg}}) = Q(\beta_{\mathrm{pr}}) = 0
\]

where $r \in (0,1)$ is the fraction of nodes that can be fully protected.

Figure 1.6: The network $G$ constructed to prove Theorem 1.4.1. It consists of an $n$-node empty network
$S_n$ and an $m$-node directed cycle $C_m$; all edges from $S_n$ to $C_m$ are present.


Theorem 1.4.1 is based on a worst-case graph construction, as shown in Fig. 1.6. Define the subgraph
$C_m$ as an $m$-node directed cycle and $S_n$ as an $n$-node empty network, and let there be edges from all
nodes $i \in S_n$ to all nodes $j \in C_m$. Formally, the edge set is given by

\[
(i, j) \in \mathcal{E} \ \text{ if any of } \
\begin{cases}
i \in S_n, \ j \in C_m \\
i, \ j = i + 1 \in C_m \\
i = m + n, \ j = n + 1 \in C_m
\end{cases}
\tag{1.20}
\]

and in all other cases $(i, j) \notin \mathcal{E}$. All weights are given by $w_{ij} = 1$ for $(i, j) \in \mathcal{E}$. Given a
vaccination fraction $r$, the sizes of the subgraphs $C_m$ and $S_n$ must satisfy $m > n + 2$ and $rN \le n$ in
order to generate a network for which greedy heuristics have zero efficiency. Such $m$ and $n$ exist for any
$r \in (0,1)$, but the networks required become very large as $r \to 1$.
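The construction (1.20) can be written down directly as an adjacency matrix. The sketch below uses 0-based indices (nodes $0, \dots, n-1$ form $S_n$ and nodes $n, \dots, n+m-1$ form $C_m$) and the convention `A[i, j] = 1` if and only if $(i, j) \in \mathcal{E}$; the function name and the choice $n = 3$, $m = 6$ (the sizes used in Fig. 1.7) are illustrative.

```python
import numpy as np

def worst_case_network(n, m):
    """Adjacency matrix of the network G in (1.20), with unit weights."""
    N = n + m
    A = np.zeros((N, N), dtype=int)
    A[:n, n:] = 1                  # every node of S_n points to every node of C_m
    for i in range(n, N - 1):
        A[i, i + 1] = 1            # directed cycle inside C_m
    A[N - 1, n] = 1                # last cycle node closes the cycle
    return A

A = worst_case_network(n=3, m=6)
out_degree = A.sum(axis=1)
total_degree = A.sum(axis=1) + A.sum(axis=0)
print(out_degree, total_degree)
```

For these sizes the out degrees are 6 on $S_n$ and 1 on $C_m$, and the total degrees are 6 and 5, matching the Network G columns of Figure 1.8; both degree-based rankings therefore place the nodes of $S_n$ first.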

The weakly connected network $G$ results in a spreading process dominated by the nodes in $C_m$ even
though nodes in $S_n$ have larger centralities. A generalization of Theorem 1.4.1 which builds a strongly
connected network with arbitrarily small efficiency can be found in [86].


Remark 1.4.1. The proof of Theorem 1.4.1 makes use of a constructive example for the centrality
measures which identify the nodes that are most likely to fail: (a) out degree and (b) PageRank with a
random walk defined as moving up the edges. If one uses centrality measures which identify the nodes that
would be the most potent seeds, such as (c) in degree or (d) PageRank computed using a random walk
that flows down the edges, one can construct an alternative $G$ by simply reversing the direction of the
edges from $S_n$ to $C_m$. Using this alternative network, one can reproduce Theorem 1.4.1 for (c) and (d).


1.4.2 Greedy Heuristics and Workstation Protection 


Consider a simple application in which such a worst-case network might arise naturally: nodes are
computers belonging to individuals in a work environment. Edges indicate access to files on another
person's computer.

Each workstation in $C_m$ is an element in the cyber layer, paired with one or more plants in the
physical layer. Workstations in $S_n$ exist only in the cyber layer and belong to a group of administrators
who can access files on all workstations in $C_m$.

Workers have limited access to each other's files, but do not have access to files on the administrators'
computers. A virus may spread when an uninfected computer accesses an infected computer. It is assumed
that an infected workstation cannot adequately control its associated plant, which leads to physical layer
failures. Protection resources take the form of antivirus software with updates on a variable time interval:
software updated more frequently provides a smaller infection rate $\beta$, but updates incur a greater cost
$f(\beta)$. The cost function

\[
f(\beta_i) = \frac{\beta_i^{-1} - \overline{\beta}^{-1}}{\underline{\beta}^{-1} - \overline{\beta}^{-1}} \tag{1.21}
\]

is chosen to satisfy $f(\overline{\beta}) = 0$, $f(\underline{\beta}) = 1$ and $f(\beta) \propto 1/\beta$. This allows us to choose the capacity $C$
equal to the number of nodes to which we wish to be able to allocate maximum protection. In our example,
the infection rate with outdated antivirus software is $\overline{\beta} = 0.5$, while the maximum update rate achieves
an infection rate of $\underline{\beta} = 0.01$. Choosing a budget of $C = 3$ for a network with $n = 3$ and $m = 6$ (such as
$G$ or $A$ shown in Fig. 1.7), the fraction of nodes that can be maximally protected is $r = 1/3$. An infected
machine has recovery rate $\delta = 0.3$, based on curative resources in the form of IT staff, which are
uniformly available.

Figure 1.7: Network G with vertices $S_3 = \{1, 2, 3\}$ and $C_6 = \{4, 5, \ldots, 9\}$ satisfies the conditions for the
counterexample network defined in Theorem 1.4.1. In Network A the subgraph on $C_6$ is relaxed to be less
structured for demonstration purposes.

Centrality measures

                Network A (C_m = undirected network)         Network G (C_m = directed cycle)
Node            Out deg.  Total deg.  PageRank  Sym. PR      Out deg.  Total deg.  PageRank  Sym. PR
node 1 (S_n)    6         6           0.289     0.093        6         6           0.299     0.122
node 2 (S_n)    6         6           0.289     0.093        6         6           0.299     0.122
node 3 (S_n)    6         6           0.289     0.093        6         6           0.299     0.122
node 4 (C_m)    2         7           0.020     0.105        1         5           0.017     0.106
node 5 (C_m)    3         9           0.023     0.128        1         5           0.017     0.106
node 6 (C_m)    2         7           0.020     0.104        1         5           0.017     0.106
node 7 (C_m)    3         9           0.023     0.128        1         5           0.017     0.106
node 8 (C_m)    2         7           0.020     0.104        1         5           0.017     0.106
node 9 (C_m)    4         11          0.026     0.151        1         5           0.017     0.106

Strategy profile β, Network A (C_m = undirected network)

Node                  No alloc.  Out deg.  Total deg.  PageRank  Sym. PR  Optimal
node 1 (S_n)          0.5        0.01      0.5         0.01      0.5      0.5
node 2 (S_n)          0.5        0.01      0.5         0.01      0.5      0.5
node 3 (S_n)          0.5        0.01      0.5         0.01      0.5      0.5
node 4 (C_m)          0.5        0.5       0.5         0.5       0.5      0.0261
node 5 (C_m)          0.5        0.5       0.01        0.5       0.01     0.0174
node 6 (C_m)          0.5        0.5       0.5         0.5       0.5      0.0261
node 7 (C_m)          0.5        0.5       0.01        0.5       0.01     0.0174
node 8 (C_m)          0.5        0.5       0.5         0.5       0.5      0.0261
node 9 (C_m)          0.5        0.5       0.01        0.5       0.01     0.0131
convergence rate ε    -1.11      -1.11     0.152       -1.11     0.152    0.248
efficiency            0          0         0.929       0         0.929    1

Strategy profile β, Network G (C_m = directed cycle)

Node                  No alloc.  Out deg.  Total deg.  PageRank  Sym. PR  Optimal
node 1 (S_n)          0.5        0.01      0.01        0.01      0.01     0.5
node 2 (S_n)          0.5        0.01      0.01        0.01      0.01     0.5
node 3 (S_n)          0.5        0.01      0.01        0.01      0.01     0.5
node 4 (C_m)          0.5        0.5       0.5         0.5       0.5      0.0196
node 5 (C_m)          0.5        0.5       0.5         0.5       0.5      0.0196
node 6 (C_m)          0.5        0.5       0.5         0.5       0.5      0.0196
node 7 (C_m)          0.5        0.5       0.5         0.5       0.5      0.0196
node 8 (C_m)          0.5        0.5       0.5         0.5       0.5      0.0196
node 9 (C_m)          0.5        0.5       0.5         0.5       0.5      0.0196
convergence rate ε    -0.2       -0.2      -0.2        -0.2      -0.2     0.28
efficiency            0          0         0           0         0        1

Figure 1.8: (Top) A variety of centrality measures are used as the basis for greedy algorithms; these measures
are reported for Networks A and G. (Bottom) The allocation strategies tested are detailed; their exponential
convergence rate bounds ε and their efficiencies are reported for comparison purposes. For the case of the
counterexample network G, none of the greedy-type algorithms yield a stable convergence rate.
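The cost function (1.21) and the budget constraint can be checked against the optimal allocations listed in Figure 1.8. The sketch below assumes the normalized form of $f$ that vanishes at $\overline{\beta} = 0.5$ and equals one at $\underline{\beta} = 0.01$; the function and variable names are illustrative.

```python
def protection_cost(beta, beta_low=0.01, beta_high=0.5):
    """Cost (1.21): 0 at beta_high (no protection), 1 at beta_low (full protection)."""
    return (1.0 / beta - 1.0 / beta_high) / (1.0 / beta_low - 1.0 / beta_high)

# Optimal strategy profiles as reported (rounded) in Figure 1.8.
optimal_A = [0.5] * 3 + [0.0261, 0.0174, 0.0261, 0.0174, 0.0261, 0.0131]
optimal_G = [0.5] * 3 + [0.0196] * 6

cost_A = sum(protection_cost(b) for b in optimal_A)
cost_G = sum(protection_cost(b) for b in optimal_G)
print(round(cost_A, 3), round(cost_G, 3))
```

Both totals come out within rounding error of the budget $C = 3$, which suggests that the optimal allocations spend the entire budget, and that in Network G the budget is spread evenly over the six nodes of the cycle rather than concentrated on three nodes.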

In the example, four heuristic algorithms based on greedily allocating resources with respect to
centrality measures are considered. The centrality measures are out degree, total degree, PageRank with
$\alpha = 0.1$, and symmetrized PageRank with $\alpha = 0.1$. Symmetrized PageRank is computed by allowing the
random walk to move over a directed edge in either direction. The worst-case networks are products of
extreme asymmetry between $C_m$ and $S_n$; the symmetrized measures are included to show that even symmetric
centrality measures do not overcome the potential for arbitrarily poor behavior. In Fig. 1.8, the top table
shows all of the centrality vectors for the example problem in the networks A and G. The network G is the
network constructed in our analytical proofs. The network A is an example of a less structured employee
collaboration network, which we include to demonstrate two points: (i) our constructed network G is not
unique and (ii) symmetrizing heuristics are less fragile than heuristics that respect edge direction.

In G and A, the out degree and PageRank heuristics allocate all resources to the admins, $S_n$. This is
ineffective because, even though the admins are the most likely to become infected, the worker group $C_m$
cannot access their files and become infected through them. Furthermore, the failure of an admin
workstation does not lead directly to physical layer failures. Fig. 1.8 (bottom) shows the infection rate
profiles generated by the various heuristics and the optimal solution. A strategy is ineffective if the
convergence rate $\epsilon$ is negative because this corresponds to unstable dynamics, where the computer virus
is spreading faster than the IT staff can repair workstations. The result is a complete failure to
consistently control any of the plants in the physical layer of the system.
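The Network G columns of Figure 1.8 can be reproduced with a few lines of linear algebra. The sketch below rebuilds $G$ for $n = 3$, $m = 6$ (0-based indices, `A[i, j] = 1` for an edge from $i$ to $j$), evaluates the convergence rate for three of the tabulated strategy profiles with $\delta = 0.3$, and computes the efficiency of the out-degree heuristic. All names are illustrative.

```python
import numpy as np

n, m, delta = 3, 6, 0.3
N = n + m

# Network G: S_n -> C_m edges plus a directed cycle on C_m.
A = np.zeros((N, N))
A[:n, n:] = 1
for i in range(n, N):
    A[i, n + (i - n + 1) % m] = 1

def rate(beta):
    """Convergence rate epsilon(beta) = -Re[lambda_1(diag(beta) A - delta I)]."""
    M = np.diag(beta) @ A - delta * np.eye(N)
    return -np.max(np.linalg.eigvals(M).real)

beta_none = np.full(N, 0.5)                      # no allocation
beta_greedy = np.array([0.01] * n + [0.5] * m)   # out-degree heuristic protects S_n
beta_opt = np.array([0.5] * n + [0.0196] * m)    # optimal profile from Figure 1.8

eps_none, eps_greedy, eps_opt = rate(beta_none), rate(beta_greedy), rate(beta_opt)
q_greedy = (eps_greedy - eps_none) / (eps_opt - eps_none)
print(round(eps_none, 3), round(eps_greedy, 3), round(eps_opt, 3), round(q_greedy, 3))
```

Protecting $S_n$ leaves the dominant eigenvalue untouched because it is determined by the cycle $C_m$ alone, so the greedy rate stays at about $-0.2$ and its efficiency is zero, while the optimal profile reaches about $0.28$.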

1.5 Conclusions 

We have studied the problem of allocating protection resources in weighted, directed networks to contain 
spreading processes, such as the propagation of viruses in computer networks, cascading failures in 
complex technological networks, or the spreading of an epidemic in a human population. We have 
considered two types of protection resources: (i) Preventive resources able to ‘immunize’ nodes against 
the spreading (e.g. vaccines), and (ii) corrective resources able to neutralize the spreading after it has 
reached a node (e.g. antidotes). We assume that protection resources have an associated cost and 
have then studied the budget-constrained allocation problem, in which we find the optimal allocation of 
resources to contain the spreading given a fixed budget. We have solved this optimal resource allocation 
problem in weighted and directed networks of nonidentical agents in polynomial time using Geometric 
Programming (GP). Furthermore, the framework herein proposed allows simultaneous optimization over 
both preventive and corrective resources, even in the case of cost functions being node-dependent. 

We have illustrated our approach by designing an optimal protection strategy for a real air
transportation network. We have limited our study to the network of the world’s busiest airports by passenger
traffic. For this transportation network, we have computed the optimal distribution of protecting
resources to contain the spread of a hypothetical world-wide pandemic. Our simulations show that the
optimal distribution of protecting resources follows nontrivial patterns that cannot, in general, be
described using simple heuristics based on traditional network centrality measures.

We then presented the following recent extensions on this work: (i) a generalized framework to cover
more realistic epidemic models, (ii) a novel data-driven framework able to handle network uncertainties,
and (iii) an analysis tool that allows us to study non-Poissonian transmission and recovery rates.
We concluded this chapter with a comparison between our results and common heuristics used in the
literature.

Exercises

Consider the following three networks with n nodes: 






Answer the following questions: 

Question 1. Compute the largest eigenvalue of the adjacency matrices of the graphs in figures (a), 
(b) and (c) as a function of n. 

Question 2. Consider the SIS model of spreading with $\beta = 0.1$ and $n = 100$. For what values of $\delta$
does an epidemic die out in each one of the three networks above? (Reminder: The epidemic dies
out when $\lambda_1 < \delta/\beta$.)

Question 3. Imagine you work for a health agency responsible for controlling an epidemic taking
place in the above networks. Assume you can tune the spreading rates of the edges within a feasible
interval $[\underline{\beta}, \overline{\beta}]$. Assume the cost associated with tuning $\beta$ is given by $f_{ij}(\beta) = 1/\beta$. Write the associated
optimization problem for each one of the above networks. (Hint: Your answer should look like equations

I.SHl.f 


Question 4. Transform the optimization problem in Question 3 into a standard geometric program.
(Hint: Your answer should look like equations 1.11-1.14.)


Question 5. Implement the geometric program from Question 4 using MATLAB’s CVX Toolbox.